A Semantic Similarity Approach to Paraphrase Detection
Abstract
This paper presents a novel approach to the problem of paraphrase identification. Although paraphrases often make use of synonymous or near-synonymous terms, many previous approaches have either ignored or made limited use of information about similarities between word meanings. We present an algorithm for paraphrase identification which makes extensive use of word similarity information derived from WordNet (Fellbaum, 1998). The approach is evaluated using the Microsoft Research Paraphrase Corpus (Dolan et al., 2004), a standard resource for this task, and found to outperform previously published methods.
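The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of scoring a sentence pair from word-to-word similarities. It is not the authors' method. The similarity table, the function names, the best-match averaging scheme, and the 0.8 threshold are all assumptions made for illustration; in practice the word scores would come from a WordNet-based measure rather than a hand-written table.

```python
# Hypothetical word-to-word similarity scores in [0, 1]. These values are
# invented for illustration; a real system would derive them from WordNet.
TOY_SIMILARITY = {
    frozenset(("buy", "purchase")): 0.9,
    frozenset(("car", "automobile")): 0.95,
}


def word_similarity(a, b):
    """Return 1.0 for identical words, else the toy table score (default 0.0)."""
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


def directed_score(source, target):
    """Average, over the source words, of each word's best match in target."""
    if not source or not target:
        return 0.0
    best_matches = (max(word_similarity(s, t) for t in target) for s in source)
    return sum(best_matches) / len(source)


def sentence_similarity(s1, s2):
    """Symmetric score: the mean of the two directed best-match averages."""
    w1, w2 = s1.lower().split(), s2.lower().split()
    return 0.5 * (directed_score(w1, w2) + directed_score(w2, w1))


def is_paraphrase(s1, s2, threshold=0.8):
    """Label a pair as a paraphrase when the score reaches the (assumed) threshold."""
    return sentence_similarity(s1, s2) >= threshold


if __name__ == "__main__":
    pair = ("she will buy the car", "she will purchase the automobile")
    print(sentence_similarity(*pair), is_paraphrase(*pair))
```

A pair that differs only by near-synonyms scores close to 1 under this scheme, whereas a purely exact-match overlap measure would penalise every substituted word. That is the kind of information the abstract says earlier approaches ignored or used only in a limited way.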
Similar resources
A Paraphrase and Semantic Similarity Detection System for User Generated Short-Text Content on Microblogs
Existing systems deliver high accuracy and F1-scores for detecting paraphrase and semantic similarity on traditional clean-text corpus. For instance, on the clean-text Microsoft Paraphrase benchmark database, the existing systems attain an accuracy as high as 0.8596. However, existing systems for detecting paraphrases and semantic similarity on user-generated short-text content on microblogs su...
CUSAT_NLP@DPIL-FIRE2016: Malayalam Paraphrase Detection
This paper describes an approach for paraphrase detection in Malayalam sentences developed as part of the FIRE 2016 Shared Task on Paraphrase Detection in Indian Languages. The task of paraphrase detection is finding a sentence with the same meaning as another sentence, expressed using the same or different words. This detection is done by a semantic approach which is language dependent. Individual words...
AMRITA_CEN@SemEval-2015: Paraphrase Detection for Twitter using Unsupervised Feature Learning with Recursive Autoencoders
We explore using recursive autoencoders for SemEval 2015 Task 1: Paraphrase and Semantic Similarity in Twitter. Our paraphrase detection system makes use of phrase-structure parse tree embeddings that are then provided as input to a conventional supervised classification model. We achieve an F1 score of 0.45 on paraphrase identification and a Pearson correlation of 0.303 on computing semantic s...
The Study and Review of Paraphrase Detection Techniques in Machine Learning
ABSTRACT: Paraphrase is a process of computing the semantic similarity between sentences which are not lexicographically similar. Though a number of metrics for the English language have been proposed in the literature to quantify textual similarity, it addresses the problem of detecting monolingual text-text lexical similarity. Existing system for Indian Language paraphrase detection uses lexica...
ASE@DPIL-FIRE2016: Hindi Paraphrase Detection using Natural Language Processing Techniques & Semantic Similarity Computations
The paper reports the approaches utilized and results achieved for our system in the shared task (in FIRE-2016) for paraphrase identification in Indian languages (DPIL). Since Indian languages have a complex inherent nature, paraphrase identification in these languages becomes a challenging task. In the DPIL task, the challenge is to detect and identify whether a given sentence pairs paraphrase...